Abstract. In this paper we consider approaches
to updating databases containing null values and incomplete
information. Our approach distinguishes between
modeling incompletely known worlds and modeling
changes in these worlds. As an alternative to the
open and closed world assumptions, we propose the
modified closed world assumption. Along with the discussion
of updating, we address some issues of refining
incompletely specified information.

Key  Words  and  Phrases.  Null  values,  incomplete 
information,  updates,  databases,  relational  databases. 

CR Categories. H.2.1, H.2.3, H.1.1.

1.  Introduction 

The real world, and a database that models it, vary
over time. At each moment in time, we have a world
state and a corresponding database state. As the
world state changes with time, we want the database
to track these changes. We distinguish between
two problems: that of modeling an incompletely known
world and that of modeling changes in that world. In the
sections below, we first discuss modeling incompletely
known static worlds, with updates serving the purpose
of refining the database when more complete information
is known. Following this discussion, we address
some issues concerning the use of updates to handle
changes in the world state.

1a. Why do we have incomplete information
about the real world?

Incomplete information arises from several sources. At
first, in the initial stages of using a new system, not all
of the necessary information may have been captured,
resulting in an incompletely known static world state.
Later on, new information may become available in a
piecemeal fashion, resulting in an incompletely specified
change to a world state. In addition, for privacy or
security reasons we may not want to store particular
information for certain members of a domain, giving
us an incompletely known static world state. Some
information may be omitted because it is quite expensive
or difficult to obtain. In a shared database,
the responsibility for capturing the information may
be decentralized. Users’ views may omit information
stored in the database [Chamberlin 75, Stonebraker
75]. Consequently, view updates [Dayal 82, Keller 82]
often result in incomplete information. Finally, some
attributes may be inapplicable in a particular situation,
indicating that the structure of the database model does
not exactly correspond to the structure of the world.

1b. Alternative Worlds

Given  an  incomplete  body  of  knowledge  about  a  world, 
we  expect  to  find  multiple  worlds  satisfying  that  body 
of  knowledge.  If  we  consider  the  body  of  knowledge  to 
be  a  theory,  then  the  possible  worlds  are  models  that 
satisfy  that  theory. 

We may choose to apply constraints to the relationship
between these models and the original theory. One
such constraint, known as the open world assumption,
states that the theory is correct but not necessarily complete.
That is, if the negation of a fact can be derived
from the theory, then that fact must be false in all
models. There can be facts true in some models (and
false in others) that are not conclusively specified in
the theory. This gives us three classes of statements:
those true in all models, those false in all models, and
those true in some models and false in others (hereafter
referred to as “true,” “false,” and “maybe” statements
or results, respectively). We shall use the term definite
results to refer to the “true” and “false” results.

Another constraint, the closed world assumption
[Reiter 78, 80], states that all relevant information is
given in the database. That is, if a fact cannot be
derived from the theory, its negation may be assumed
to hold. A database is consistent with the closed world
assumption if the set of facts not derivable from the
database is consistent with the database, taken as a
theory. Definite databases (those not containing disjunctions)
are consistent with the closed world assumption.
In particular, databases containing disjunctions
of multiple positive terms are not consistent with the
closed world assumption. Under the closed world assumption,
there is only one model of the theory, so there
are no “maybe” statements. This is the usual model for
database theories not containing nulls.

A third constraint, developed at length by Levesque
[80, 82], may be called the modified closed world assumption.
In this case, the theory may explicitly state
where its knowledge is incomplete. The theory may
contain disjunctions, and as in the open world assumption,
it may have multiple models. All facts true in any
particular model of a modified closed world theory must
be derivable either as part of a disjunction explicitly
mentioned in the theory, or else derivable from such
disjunctions. All facts not derivable from such combinations
of the disjunctions are assumed to be false.
This assumption permits “true,” “false,” and “maybe”
statements; however, many of the “maybe” statements
in a given database under the open world assumption
will become false under the modified closed world assumption.
In a relational database, the disjunctions can
appear at four levels: disjunctions of values, of tuples,
of relations, and of databases. Definite database models
of an indefinite database are obtained by choosing one
of each of the disjuncts, provided that the resulting
database satisfies all constraints.

In the work below, we restrict our attention to
databases under the modified closed world assumption.
Further, we will only consider the least expressive and
most tractable levels of disjunction, those of values and
of tuples with true.

Let us present some examples. Consider the following
database.

Name     Address       Telephone

Susan    Apt 7 or 12   555-0123
Pat      Apt 7         555-9876
Sandy    Apt 17        none
George   Apt 9         unknown

Who is in Apt 7? The “true” result is Pat, and the
“maybe” result is Susan.

Is Susan in Apt 7 or Apt 12? We would like
to answer “yes”; after all, it is necessarily true that
Susan may be found at one or both of these addresses.
However, we have a potential problem in that this query
is not equivalent to the disjunction of the queries “Is
Susan in Apt 7?” and “Is Susan in Apt 12?”; for
the answer to this disjunction is “maybe.” The query
answering algorithm must expend particular effort to
deduce the “yes” answer rather than the “maybe”
answer.

Who does not have a phone starting with 555?
The “true” result is Sandy, and the “maybe” result is
George.
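
The address queries above can be illustrated with a minimal sketch. It assumes that a set null is stored as a set of candidate values and that a predicate is classified by testing every candidate; the names truth, or3, select, and RESIDENTS are hypothetical and serve only this illustration. It also shows why the query about Susan must be evaluated as a whole rather than as a disjunction of two separate queries.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

# Assumption: a set null is modelled as a frozenset of candidate values;
# a singleton set is an ordinary, fully known value.
Row = Dict[str, FrozenSet[str]]

RESIDENTS: List[Row] = [
    {"Name": frozenset({"Susan"}), "Address": frozenset({"Apt 7", "Apt 12"})},
    {"Name": frozenset({"Pat"}), "Address": frozenset({"Apt 7"})},
    {"Name": frozenset({"Sandy"}), "Address": frozenset({"Apt 17"})},
    {"Name": frozenset({"George"}), "Address": frozenset({"Apt 9"})},
]


def truth(candidates: FrozenSet[str], pred: Callable[[str], bool]) -> str:
    """Classify a predicate over a non-empty set null as true, false, or maybe."""
    hits = sum(1 for value in candidates if pred(value))
    if hits == len(candidates):
        return "true"
    return "maybe" if hits else "false"


def or3(left: str, right: str) -> str:
    """Three-valued (Kleene) disjunction of two answers."""
    if "true" in (left, right):
        return "true"
    return "maybe" if "maybe" in (left, right) else "false"


def select(rows: List[Row], attr: str,
           pred: Callable[[str], bool]) -> Tuple[List[str], List[str]]:
    """Return the names in the "true" result and in the "maybe" result."""
    true_names: List[str] = []
    maybe_names: List[str] = []
    for row in rows:
        answer = truth(row[attr], pred)
        name = next(iter(row["Name"]))  # names are definite in this example
        if answer == "true":
            true_names.append(name)
        elif answer == "maybe":
            maybe_names.append(name)
    return true_names, maybe_names


susan = RESIDENTS[0]["Address"]
# Evaluating the disjunctive predicate against the whole set null:
whole_query = truth(susan, lambda a: a in {"Apt 7", "Apt 12"})
# Evaluating each disjunct separately and combining the answers:
split_query = or3(truth(susan, lambda a: a == "Apt 7"),
                  truth(susan, lambda a: a == "Apt 12"))
```

Here whole_query is “true” while split_query is only “maybe,” which is the loss of information that the query answering algorithm must work to avoid.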

2. Incompleteness and Relational Databases

How can the relational model [Codd 70, 79, 82, Maier
83, Ullman 83] be extended to include incomplete information?
Let us assume that the relational model is
capable of representing the relevant portion of the real
world, were the necessary information available. Then
we shall explore extensions to the relational model to
support various levels of incompleteness.

The standard relational model consists of a set of
relation schemas and a set of constraints. Each relation
schema has a set of labelled domains called attributes.
A relation is an unordered set of tuples, each tuple assuming
a value for each attribute. We will use the term
attribute value to refer to the value of a particular attribute
for a specified tuple. First normal form requires
that each attribute for each tuple be an atomic value,
that is, one value in its domain.

A simple incompleteness that may exist is that we
may have only partial information available about an
entity whose identity we know. If entity names may
serve as keys to the relation, this corresponds to a tuple
with a known key value, but with non-atomic values
for some attributes. This situation violates first normal
form in that we cannot assign a specific value to every
attribute in the corresponding tuple.

Let us consider the types of such non-atomic values.
The ANSI/X3/SPARC study group for database management
systems specifications generated a list of 14
different manifestations of null values [ANSI 75], for
which we propose a taxonomy as follows. First, it
may be that no domain value is applicable for an attribute;
consider, for example, the value of the attribute
Supervisor’s-Name for the president of a company. We
call this value inapplicable. The second case occurs
when the value is known to be in a particular set of
values, perhaps including inapplicable. This concept of
a set null includes null values specified as ranges (for example,
20 < Age < 30). Using an example from Section
1b, {Apt 7, Apt 12} is a set null. In the case where an
attribute is applicable for a tuple but no further information
is known, the set null is the entire domain of
the attribute.

Note that the choice of sets as a representation
formalism need not be restricted to null values: any
singleton set other than the value inapplicable represents
a non-null value.* We may regard all occurrences
of single values as degenerate cases of set nulls.
Almost all types of nulls considered in the literature are
(possibly restricted) cases of set nulls.

*Even when the nulls merely signify “no information,” there are
problems in answering queries to databases with nulls [Keller 84].

2a.  Objects 

The concept of objects permits constraints on the
appearance of inapplicable null values in relational
databases [Goldstein 81, Maier 80, 83, Sciore 80]. In
brief, a relation can be divided into a set of relations,
all with the same key or primary attributes, so that
desirable information can be recorded solely by creating
tuples without inapplicable [Codd 79, El-Masri 79,
80, Wiederhold 83]. If the logical database design corresponds
in this manner to the objects identified, and we
assume that no null values are allowed in the primary
attributes for an entity, we will never need the null value
inapplicable. The possibility of an attribute being inapplicable
for a given tuple can be handled by attaching
a condition to the tuple, as described in the next section.
In the discussion of updates and refinement, we
assume that inapplicable nulls have been eliminated in
this fashion.

2b.  On  Representation  of  Incompleteness 

In this section, we will summarize ways of representing
alternative worlds. A set of alternative worlds,
each describable by a relational database, may also be
described by a single database with conditions attached
to tuples. However, it is difficult to compute solutions to
queries for a database expressed in this form. Therefore,
we also present other representations that are more conducive
to manipulation.

Conditional relation. This is the most expressive
form for describing incompleteness in extended relational
databases. A conditional relation is the extension
of an ordinary relation to contain one additional
attribute, a condition to be applied to each tuple. A
tuple with a condition appended is called a conditional
tuple, and it may appear in query “maybe” results.

We can identify several classes of simple, useful
conditions. The possible condition is for tuples where
no specific information is known about the conditions
under which the tuple will exist. In other words, the
existence of a possible tuple is independent of the state
of the remainder of the database.

A second class consists of sets of alternative tuples.
A set of alternative tuples is called an alternative set.
Precisely one of the members of an alternative set
must exist in any model of an incomplete database.
Alternative tuples are simply a generalization of null
values to null tuples, of set nulls to set tuples. In
contrast to alternative tuples, any number of possible
tuples may hold in any alternative world.
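
The difference between possible tuples and alternative sets can be made concrete by enumerating the alternative worlds they generate. The following is a minimal sketch under simplifying assumptions: all attribute values are definite, no integrity constraints are applied, and the function name possible_worlds and the sample tuples are hypothetical.

```python
from itertools import combinations, product
from typing import FrozenSet, List, Sequence, Set, Tuple

Tup = Tuple[str, ...]
World = FrozenSet[Tup]


def possible_worlds(definite: Sequence[Tup],
                    possible: Sequence[Tup],
                    alternative_sets: Sequence[Sequence[Tup]]) -> Set[World]:
    """Enumerate every world described by a conditional relation.

    - definite tuples (condition "true") appear in every world;
    - any subset of the possible tuples may appear;
    - exactly one member of each alternative set appears.
    """
    subsets: List[Tuple[Tup, ...]] = [
        chosen
        for size in range(len(possible) + 1)
        for chosen in combinations(possible, size)
    ]
    worlds: Set[World] = set()
    for chosen in subsets:
        # product() over zero alternative sets yields one empty choice.
        for picks in product(*alternative_sets):
            worlds.add(frozenset(definite) | frozenset(chosen) | frozenset(picks))
    return worlds


worlds = possible_worlds(
    definite=[("Pat", "Apt 7")],
    possible=[("Sandy", "Apt 17")],
    alternative_sets=[[("Susan", "Apt 7"), ("Susan", "Apt 12")]],
)
```

With one possible tuple and one two-member alternative set, the sketch yields four worlds, each containing exactly one of the Susan tuples.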


Another class of conditions, called predicated, is
explored by Imielinski and Lipski [81]. It consists of
expressions built up from atomic conditions using conjunction,
disjunction, and negation. The atomic forms
are true, false, and comparisons between an attribute
and a definite value or between two attributes. The
largest class, called arbitrary, consists of any relational
expression that can be applied to ordinary relational
databases.

In this paper we will restrict our attention to possible
conditions. Due to space limitations, we cannot
also cover alternative conditions.

Set nulls. The inclusion of set nulls in conditional
relations makes it possible to represent the relations
more concisely and makes it easier to compute answers
to queries, without increasing the expressive power of
conditional relations.

Predicates. Predicates are data-dependent constraints
applied to null values. For the purposes of this
paper, the most useful predicates are marked nulls, which
denote known equality of the actual, unknown values
of nulls. Two marked nulls with the same marking are
known to have the same actual, unknown value, but
two marked nulls with differing marks may or may not
have the same actual, unknown value. More complex
predicates are also possible, but we shall not consider
them here.

3. Updating Incomplete Databases That Model
Static Worlds

Each incomplete database corresponds to a set of alternative
worlds that the database models. When we update
an incomplete database, the update will fall into
one of two categories. The first type of update adds
previously unknown information about a static world
situation and generates a new set of alternative worlds
that is a subset of the original group of alternative
worlds. This type of update is relatively easy to implement
correctly. The second type of update involves
tracking changes in a dynamic world of which we have
incomplete knowledge. Static world knowledge-adding
updates are discussed in this section, and the discussion
of dynamic world updates is deferred to Section 4.

It is important to note that under the modified
closed world assumption, any entity that is known to
possibly participate in a relation should be represented
by a separate tuple in that relation, perhaps with a possible
condition attached. Otherwise, the introduction
of any tuple that implies the participation of that entity
in the relation must be treated as a dynamic world
change-recording update.

There are two concerns in updating an incomplete
database that models a static world. The first of these,
the nature of the updates and how to specify them,
will be discussed in Section 3a. The second concern
is how to assimilate the update into the database via
refinement. Refinement has appeared in the literature
[Imielinski 83, Maier 83, Osborn 81, Walker 80], and it
will be considered in Section 3b.

3a. Updates in a Static World

Updates in incomplete databases modelling static
worlds serve to add knowledge to the database. There
may be many alternative worlds satisfying an incomplete
static world database, and updates may reduce
the number of these possibilities. For example, conditional
tuples can have their conditions determined or
augmented. Set nulls can be updated by eliminating
some alternatives from the sets. Additional predicates,
such as marked nulls, can be imposed on the database.

The first step in processing an update is to determine
the “true” and “maybe” results of its selection
clause. This is a difficult problem which will not be
treated here (see [Codd 79, Maier 83]); the discussion
below presupposes a successful resolution of the
query answering problem. We note that syntactic extensions
to the language, with accompanying semantic
definitions, will be needed to designate set nulls. In
addition, the user must be able to add and remove possible
conditions in updates in order to satisfy the requirements
of the modified closed world assumption and
our postulations regarding the use of inapplicable null
values.

Under the modified closed world assumption, deletions
have no place in a static world. A tuple update
consisting of a deletion followed by an insert operation
will violate the modified closed world assumption unless
the two are bundled into the same transaction. We
use the convention that an UPDATE operation specifies
the modification of an entity or relationship already
in the database, while an INSERT operation supplies
information about a new entity or relationship. In a
static world under the modified closed world assumption,
UPDATE requests are only reasonable to the extent
that they supply additional, non-conflicting information
about existing entities; INSERT requests are not
permitted, for there can be no new entities.

With the UPDATE operator, one may update the
“true” results of a selection clause as usual, with some
extra attention given to handling marks. But what
action should be taken on the “maybe” result of the
selection clause? In a static world under the modified
closed world assumption, an update can only serve to
narrow the range of choices within a set null. Therefore,
the first possibility is that the target attribute values
do not already include the new values, in which case
the tuple cannot be in the “true” result of the selection
clause. A sophisticated query processor might use that
fact to refine certain fields of the failing tuple. The
second possibility is that the target attribute values do
already include the new values, in which case the best
action in our model is simply to ignore the update. The
third possibility is that the old and new attribute values
are set nulls with a partial overlap in values; in this case
we may try a simple technique called tuple splitting.
Consider the following example.

Vessel            HomePort              Condition

{Henry, Dahomey}  {Boston, Charleston}  true

UPDATE [HomePort := SETNULL ({Boston, Cairo})]
WHERE Vessel = "Henry"

In  the  result,  we  have  split  the  original  tuple  into  two 
possible  tuples.  One  tuple  covers  the  case  that  the  tuple 
is  actually  in  the  “true”  result,  and  the  other  tuple 
covers  the  “false”  result  case.  If  we  perform  the  update, 
we  could  get  the  following  relation. 

Vessel            HomePort              Condition

{Henry, Dahomey}  {Boston, Cairo}       possible
{Henry, Dahomey}  {Boston, Charleston}  possible

Note, however, that the Henry could not be in Cairo
because that was not permitted in the original database,
and we are working in a static world under the modified
closed world assumption. This gives us the following
result.

Vessel            HomePort              Condition

{Henry, Dahomey}  Boston                possible
{Henry, Dahomey}  {Boston, Charleston}  possible

It  is  a  very  difficult  problem  in  general  to  determine 
exactly  which  set  null  values  would  put  a  tuple  in  the 
“true”  result  and  which  would  put  it  in  the  “false” 
result.  However,  a  smarter  query  answering  algorithm 
might  be  able  to  produce  the  following. 

Vessel   HomePort              Condition

Henry    Boston                possible
Dahomey  {Boston, Charleston}  possible

Since there may now be zero, one, or two ships, this
method violates the modified closed world assumption
in a static world. This problem may be avoided by using
an alternative set containing the two tuples, so that
precisely one of them will hold. This latter approach
incurs other problems that are beyond the scope of this
paper.
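
The simple form of tuple splitting described above can be sketched as follows. This is only an illustration under stated assumptions: the selection is an equality test on Vessel, set nulls are stored as sets, and the names CondTuple and split_on_update are hypothetical. The sketch reproduces the intermediate result with two possible tuples, so it shares the limitation just noted: the split tuples would need to form an alternative set to respect the modified closed world assumption.

```python
from typing import FrozenSet, List, NamedTuple


class CondTuple(NamedTuple):
    vessel: FrozenSet[str]      # set null of candidate vessel names
    home_port: FrozenSet[str]   # set null of candidate home ports
    condition: str              # "true" or "possible"


def split_on_update(t: CondTuple, vessel: str,
                    new_ports: FrozenSet[str]) -> List[CondTuple]:
    """Apply UPDATE [HomePort := SETNULL(new_ports)] WHERE Vessel = vessel."""
    if vessel not in t.vessel:
        return [t]                       # "false" result: not selected
    # In a static world an update may only narrow a set null, so values
    # absent from the original set (such as Cairo) are discarded.
    narrowed = t.home_port & new_ports
    if not narrowed or narrowed == t.home_port:
        # No overlap, or nothing new learned: leave the tuple unchanged.
        # (A fuller version would report a conflict for a definite vessel.)
        return [t]
    if len(t.vessel) == 1:
        return [t._replace(home_port=narrowed)]   # "true" result
    # "maybe" result: one copy assumes the selection held, one that it failed.
    return [
        CondTuple(t.vessel, narrowed, "possible"),
        CondTuple(t.vessel, t.home_port, "possible"),
    ]


original = CondTuple(frozenset({"Henry", "Dahomey"}),
                     frozenset({"Boston", "Charleston"}), "true")
result = split_on_update(original, "Henry", frozenset({"Boston", "Cairo"}))
```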

3b. Refinement in a Static World

Refinement  is  a  process  that  alters  the  state  of  the 
database  without  affecting  its  set  of  possible  worlds.  If 
coupled  with  a  query  answering  strategy  that  generates 
all  possible  worlds  and  then  performs  the  query  on  each 
of  them,  refinement  may  affect  the  efficiency  of  the 
computation  but  not  the  answers  to  queries;  otherwise, 
refinement  may  affect  the  answers. 

Refinement simplifies the contents of the database
by applying known dependencies and constraints to
establish conditions on the existence of null values
[Vassiliou 80, Lien 79, Maier 83]. This process may allow
a query answering strategy to provide more informative
answers to queries. Refinement can also help
to verify the compatibility of updates with static world
databases, thus preventing violations of dependencies
and integrity constraints.

Let us begin by considering functional dependencies.
In our model, we can use these dependencies to
establish when two nulls must have the same mark. For
example, suppose Ship -> HomePort in the following
relation.

Ship    HomePort

Wright  {Managua, Taipei}
Wright  {Taipei, Pearl Harbor}

We may conclude that this is actually the following
relation.

Ship    HomePort

Wright  Taipei

We have eliminated a null value, enabling us to give
more informative answers to queries. For example, if
the user asks for a list of all ships with a HomePort of
Taipei, then the Wright will be in the “maybe” result for
the unrefined database, but in the “true” result for the
refined version. More generally, suppose we are given
a relation with A -> B, containing two tuples with set
nulls s1 and s2, as follows.

A   B

a1  s1
a1  s2

We may refine this to the following single tuple.

A   B

a1  s1 ∩ s2

Similarly, if A -> B in the following relation, and b1 and
b2 are known to be unequal, then we may conclude that
a1 and a2 must have different values. Indeed, either or
both of b1 and b2 may be set nulls, as long as the sets
have no elements in common.

A   B

a1  b1
a2  b2

If, say, a1 is a non-null value, then we can replace a2
by a2 - a1 (the set null a2 with the value a1 removed).
That is, the keys of the two tuples must be unequal.

Functional dependencies can also be used to refine
the conditions appended to tuples. For example, let
A -> B in the following relation.

A   B   Condition

a1  b1  true
a1  b1  possible

If a1 is a non-null value, this refines to the following
relation.

A   B   Condition

a1  b1  true

Refinement helps to catch consistency errors that
are violations of known dependencies. (The refinement
process is similar to the chase algorithm for inference of
dependencies [Ullman 83].) The presence of such errors
is signalled by the appearance of a set null with no
elements (the empty set). For example, if s1 ∩ s2 in
the example above is the empty set, then an error has
occurred. As presented, refinement is not sufficient to
detect all violations of functional dependencies, nor to
eliminate as many nulls as would be possible with a
more general mechanism.
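
The intersection rule for a functional dependency A -> B, together with the empty-set check, might be sketched as follows. We assume here that the A values are non-null and that only B may hold a set null; the function name refine_fd is hypothetical. As noted above, a rule of this kind does not detect every violation.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple


def refine_fd(rows: Iterable[Tuple[str, FrozenSet[str]]]
              ) -> List[Tuple[str, FrozenSet[str]]]:
    """Refine a relation under A -> B by intersecting set nulls.

    Each row is (a, b_candidates) with a definite. Rows sharing the same
    A value must agree on B, so their candidate sets are intersected.
    An empty intersection signals a violation of the dependency.
    """
    merged: Dict[str, FrozenSet[str]] = {}
    for a, b_candidates in rows:
        merged[a] = merged[a] & b_candidates if a in merged else b_candidates
        if not merged[a]:
            raise ValueError(f"functional dependency violated for A = {a!r}")
    return list(merged.items())


refined = refine_fd([
    ("Wright", frozenset({"Managua", "Taipei"})),
    ("Wright", frozenset({"Taipei", "Pearl Harbor"})),
])
```

For the Wright example, the two set nulls intersect to the single value Taipei, so the refined relation contains one definite tuple.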

We  have  given  some  simple  rules  for  refining 
databases  with  functional  dependencies.  One  may 
define  rules  in  a  similar  fashion  for  all  varieties  of 
generalised  dependencies. 

4. Tracking Changing Worlds

Let us now consider the situation where the database is
modeling a dynamic, changing world. We will discuss
issues relating to updates and refinement in this context.

4a. Updates in a Changing World

In the discussion below, we consider how to insert tuples
into relations, how to delete them, and how to do other
kinds of updates. These updates will fall into two categories:
knowledge-adding updates, which represent new
information about the dynamic world at one particular
moment in time, and change-recording updates, which
track changes in the world over time. We will consider
corrections as knowledge-adding updates if the
new set of possible worlds is included in the original;
otherwise they are change-recording updates because
they cause a transformation to a different set of possible
worlds. Equivalently, before performing a knowledge-adding
update, the database already models the new set
of possible worlds. Change-recording updates are particularly
difficult to execute correctly, and matters are
complicated by the fact that it is not usually possible to
tell whether an update is knowledge-adding or change-recording.
Knowledge-adding updates were discussed in
the previous section; here we consider change-recording
updates and how to proceed when the type of an update
is unknown.

For example, consider the following relation and
update.

Vessel   Port               Cargo

Dahomey  Boston             Honey
Wright   {Boston, Newport}  Butter

INSERT [Vessel := "Henry", Cargo := "Eggs",
Port := SETNULL ({Cairo, Singapore})]

The result after performing the update on the relation
is as follows.

Vessel   Port                Cargo

Dahomey  Boston              Honey
Wright   {Boston, Newport}   Butter
Henry    {Cairo, Singapore}  Eggs

Under the modified closed world assumption, this is
a change-recording update because the Henry was not
previously known to exist. Unfortunately, change-recording
insert operations can interact disastrously
with refinement in relations with functional dependencies.
(We will discuss this further in Section 4b.)

We use the convention that an UPDATE operation
specifies the modification of an entity or relationship already
in the database, while an INSERT operation supplies
information about a new entity or relationship.
It is not always clear whether an assertion should be
treated as an insertion or an update. This is especially
true when the update is specified through natural language
[Davidson 84].

For other types of updates, tuples in the “true”
result of the selection clause can be updated as usual.
For “maybe” results, when only set nulls are involved,
the first option is to do nothing and expect the user
to explicitly update the “maybe” result by means of a
truth operator in the selection clause [Codd 79, Lipski
79], as in the following example.

UPDATE [Port := Cairo]
WHERE MAYBE (Port = "Cairo")

Result:

Vessel   Port               Cargo

Dahomey  Boston             Honey
Wright   {Boston, Newport}  Butter
Henry    Cairo              Eggs


As a second option, the database system can explicitly
ask the user on the fly what to do about the
“maybe” results.

As a third option, we can bravely attempt to
automatically update the “maybe” results. In a model
of conditional tuples and set nulls, one can use the tuple
splitting technique of Section 3a. Consider the effect of
a cargo update on the previous relation.

UPDATE [Cargo := "Guns"]
WHERE Port = "Boston"

Vessel   Port               Cargo   Condition

Dahomey  Boston             Guns    true
Wright   {Boston, Newport}  Guns    possible
Wright   {Boston, Newport}  Butter  possible
Henry    Cairo              Eggs    true

We have given the original tuple a possible condition,
created a duplicate, and then performed the update in
place on the new tuple. (The two null values {Boston,
Newport} would be given the same mark.) We have
generated quite a few new alternative worlds for the
database. To reduce this diversification, a clever query
answering algorithm might be able to tell us which set
null values would give rise to “false” result tuples and
which to “true” result tuples. With such an algorithm,
we could give the following result from the cargo update.

Vessel   Port     Cargo   Condition

Dahomey  Boston   Guns    true
Wright   Boston   Guns    possible
Wright   Newport  Butter  possible
Henry    Cairo    Eggs    true

As we have seen, appending possible conditions when
splitting tuples generates new possible worlds. The use
of an alternative set for the split tuples avoids this problem
at the expense of additional complications during
future updates, a consideration beyond the scope of this
paper.
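
The refined form of tuple splitting, in which the set null is partitioned into the values that satisfy the selection clause and those that do not, can be sketched as follows. The sketch assumes an equality selection on a single set-null attribute and uses possible conditions rather than an alternative set, so it generates the extra worlds discussed above. The function name update_with_split and the dictionary layout are hypothetical.

```python
from typing import Dict, FrozenSet, List, Union

Value = Union[str, FrozenSet[str]]
Row = Dict[str, Value]


def update_with_split(rows: List[Row], where_attr: str, where_value: str,
                      set_attr: str, new_value: str) -> List[Row]:
    """UPDATE [set_attr := new_value] WHERE where_attr = where_value.

    where_attr holds a set null (frozenset); set_attr holds a plain value.
    """
    result: List[Row] = []
    for row in rows:
        candidates = row[where_attr]
        assert isinstance(candidates, frozenset)
        if where_value not in candidates:
            result.append(dict(row))                      # "false" result
        elif len(candidates) == 1:
            result.append({**row, set_attr: new_value})   # "true" result
        else:
            # "maybe" result: split on the set null.
            result.append({**row, where_attr: frozenset({where_value}),
                           set_attr: new_value, "Condition": "possible"})
            result.append({**row, where_attr: candidates - {where_value},
                           "Condition": "possible"})
    return result


fleet: List[Row] = [
    {"Vessel": "Dahomey", "Port": frozenset({"Boston"}),
     "Cargo": "Honey", "Condition": "true"},
    {"Vessel": "Wright", "Port": frozenset({"Boston", "Newport"}),
     "Cargo": "Butter", "Condition": "true"},
    {"Vessel": "Henry", "Port": frozenset({"Cairo"}),
     "Cargo": "Eggs", "Condition": "true"},
]
updated = update_with_split(fleet, "Port", "Boston", "Cargo", "Guns")
```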

Another potential solution is null propagation,
where fields that are the target of an update are transformed
into set nulls. The following series of examples
illustrates this technique.

AB               C

A   B            C

v1  {v2, v3}     v2
                 v3

UPDATE [A := C]
WHERE B = C

Using null propagation, we obtain the following relation
AB.

A         B

{v1, v2}  {v2, v3}
{v1, v3}  {v2, v3}

However,  the  set  of  possible  worlds  corresponding  to 
this  database  is  disjoint  from  the  correct  set  of  possible 
worlds.  Splitting  the  original  tuple  into  two  alternative 
tuples,  we  obtain  the  following  relation  AB  (before  the 
update). 

A    B    Condition

v1   v2   alternative set 1
v1   v3   alternative set 1

The  updated  relation  then  becomes  the  following. 

A    B    Condition

v2   v2   alternative set 1
v3   v3   alternative set 1
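
The disjointness claim can be checked by enumerating
possible worlds. The sketch below is an illustration
only: it assumes unmarked set nulls (each chosen
independently), models a world as a set of tuples, and
uses a hypothetical helper named worlds.

```python
from itertools import product


def worlds(relation, alternative=False):
    """Enumerate the possible worlds of a relation with set nulls.

    A set null is a frozenset of candidate values. With alternative=True the
    rows form one alternative set, so exactly one tuple exists in each world.
    """
    def expand(row):
        choices = [sorted(f) if isinstance(f, frozenset) else [f] for f in row]
        return [tuple(c) for c in product(*choices)]

    if alternative:
        return {frozenset([t]) for row in relation for t in expand(row)}
    return {frozenset(combo) for combo in product(*(expand(r) for r in relation))}


# Relation AB after null propagation.
propagated = worlds([
    (frozenset({"v1", "v2"}), frozenset({"v2", "v3"})),
    (frozenset({"v1", "v3"}), frozenset({"v2", "v3"})),
])

# Relation AB after splitting into an alternative set and then updating.
correct = worlds([("v2", "v2"), ("v3", "v3")], alternative=True)

print(propagated.isdisjoint(correct))
```

Every world of the null-propagated relation that
contains (v2, v2) or (v3, v3) also contains a second,
different tuple, so none of them coincides with a
one-tuple world of the correct result.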

To  delete  a  tuple  that  is  in  the  “maybe”  result, 
one  could  append  the  possible  condition  and  refine  the 
tuple.  Consider  the  following  relation  and  update. 

Ship             Port

{Jenny, Wright}  {Boston, Cairo}

DELETE WHERE Ship = "Jenny"

With  a  somewhat  clever  query  algorithm,  we  can  first 
split  the  tuple  as  follows. 

Ship     Port             Condition

Jenny    {Boston, Cairo}  alternative set 1
Wright   {Boston, Cairo}  alternative set 1

We  then  delete  the  first  tuple  as  requested.  Notice  that 
the  second  tuple  changes  from  an  alternative  tuple  to  a 
possible  tuple. 

Ship     Port             Condition

Wright   {Boston, Cairo}  possible
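
The split-then-delete step above can be sketched as
follows. This is a minimal illustration, assuming that
the Ship field is always stored as a frozenset (a
singleton for a known value) and that the function name
delete_where_ship is hypothetical. Rather than
materializing the intermediate alternative set, the
sketch removes the deleted candidate from the set null
and marks what remains as possible, which gives the same
result for the two-candidate case shown.

```python
def delete_where_ship(relation, ship):
    """Sketch of DELETE WHERE Ship = ship over (ships, port, condition) rows.

    `ships` is a frozenset of candidate ship names (a set null when it has
    more than one member).
    """
    result = []
    for ships, port, condition in relation:
        if ship not in ships:
            # Definitely not selected: keep unchanged.
            result.append((ships, port, condition))
        elif len(ships) > 1:
            # "Maybe" selected: drop the deleted candidate. The tuple exists
            # only if it was not the deleted ship, so it becomes possible.
            result.append((ships - {ship}, port, "possible"))
        # Definitely selected (ships == {ship}): the tuple is removed.
    return result


before = [(frozenset({"Jenny", "Wright"}), frozenset({"Boston", "Cairo"}), "true")]
after = delete_where_ship(before, "Jenny")
print(after)
```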

Deletion under the modified closed world assumption
is a very strong statement; a deletion based on a
key  value  is  equivalent  to  declaring  that  the  entity  is  no 
longer  in  the  world.  To  delete  a  relationship  between 
entities  that  continue  to  exist,  it  is  better  to  replace 
the  original  relationship  with  one  or  more  relationships 
containing  nulls.  If  this  is  done,  the  original  entities  will 
continue  to  be  known,  but  they  will  be  unrelated. 

In the next section, we will consider refinement in
the face of change-recording updates.

4b.  Refinement in a Changing World

In a static world, refinement is a safe process; in a
dynamic world, refinement must be done only at a correct
static state. For a correct final dynamic world
database state to be achieved, conflicting updates must
be supplied in the right order, and refinement must not
be done until all change-recording updates corresponding
to the same point in time have been accepted, that is,
until the database corresponds to an actual
static world state.

In a static world, refinement does not affect the
set of possible worlds; rather, it affects the difficulty
of computing the set of possible worlds. On the other
hand, updates can reduce the set of possible worlds. In
a static world, a refined database is equivalent to its
unrefined version, in that they give the same answers to
all queries. Note, however, that refinement may assist a
query answering strategy to produce a more informative
answer, even though the databases are equivalent.

In a dynamic world, refinement may affect the set
of worlds possible after an update. Given a database,
refinement can produce a second equivalent database,
but after identical updates, the refined and unrefined
updated databases may no longer be equivalent [Fagin
83, Kuper 84].

The problem with refinement can arise when attempting
to store facts that are general rules in the
database along with facts that are merely statements
of current conditions. For example, suppose that the
Kranj and the Totor alternate between Victoria and
Vancouver. Thus, one of these two ships is always in
Vancouver. However, we also know that the Totor is
currently in Victoria. This results in the following
relation.

Ship             Location

{Kranj, Totor}   Vancouver
Totor            Victoria

The  relation  after  refinement  follows. 

Ship  Location 

Kranj  Vancouver 

Totor  Victoria 

Suppose  that  the  Totor  moves  to  Vancouver.  This 
changes  the  relation  as  follows. 

Ship  Location 

Kranj  Vancouver 

Totor  Vancouver 

But if we apply the update to the unrefined relation, we
get a different relation.

Ship             Location

{Kranj, Totor}   Vancouver
Totor            Vancouver

Notice that this relation admits the possibility that the
Kranj has moved to Victoria. This example shows that
refinement can cause some worlds not to be possible
when the database undergoes a change-recording
update.
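
The Kranj and Totor example can be replayed by
enumerating possible worlds before and after the move.
The sketch below rests on assumptions that are not
spelled out in the text: each ship is taken to have at
most one location (the constraint that justifies the
refinement), the update rewrites only tuples that
definitely name the ship, and the helper names worlds
and move are hypothetical.

```python
from itertools import product


def worlds(db):
    """Possible worlds of a Ship/Location relation whose Ship may be a set null.

    Worlds in which one ship appears at two locations are discarded
    (assumed key constraint: a ship is in one place at a time).
    """
    result = set()
    locations = [loc for _, loc in db]
    for ships in product(*(sorted(s) for s, _ in db)):
        world = frozenset(zip(ships, locations))
        if len({ship for ship, _ in world}) == len(world):
            result.add(world)
    return result


def move(db, ship, location):
    """Change-recording update: rewrite tuples that definitely name `ship`.

    Set-null tuples are left untouched, as in the example in the text.
    """
    return [(s, location if s == frozenset([ship]) else loc) for s, loc in db]


unrefined = [(frozenset({"Kranj", "Totor"}), "Vancouver"),
             (frozenset({"Totor"}), "Victoria")]
refined = [(frozenset({"Kranj"}), "Vancouver"),
           (frozenset({"Totor"}), "Victoria")]

same_before = worlds(unrefined) == worlds(refined)
after_unrefined = worlds(move(unrefined, "Totor", "Vancouver"))
after_refined = worlds(move(refined, "Totor", "Vancouver"))
print(same_before, after_refined == after_unrefined)
```

Under these assumptions the two databases are
equivalent before the update, but afterwards the
unrefined database has an extra world containing only
(Totor, Vancouver), in which the Kranj's location is no
longer recorded.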

5.  Summary  and  Conclusion 

We  have  considered  how  databases  can  be  used  to  model 
incomplete  knowledge  about  the  world.  Given  a  body 
of  incomplete  knowledge,  there  is  a  set  of  possible 
worlds  that  are  consistent  with  that  knowledge.  The 
modified  closed  world  assumption  allows  the  database 
to  explicitly  state  where  its  information  is  incomplete. 
If  the  database  provides  a  definite  answer  to  a  query, 
we  want  that  answer  to  hold  in  the  real  world  that 
the database is modeling. Where knowledge is lacking,
the database may have to indicate that the answer
is  “maybe.”  We  would  like  the  database  to  provide 
definite  answers  whenever  possible.  An  answer  to  a 
query  which  is  true  in  all  possible  world  models  of  the 
database  is  considered  the  “true”  result.  Similarly,  the 
“false”  result  is  not  true  in  any  possible  world.  That 
which  is  true  in  some  worlds  and  false  in  others  is  in  the 
“maybe”  result.  Some  query  answering  strategies  may 
not  be  able  to  find  all  the  “true”  and  “false”  results  to 
some  queries,  and  instead  report  an  expanded  “maybe” 
result. 
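
The three-way classification of answers can be stated
compactly. The sketch below is illustrative only; it
assumes that the set of possible worlds is non-empty and
has already been enumerated, that each world is a set of
tuples, and that the function name classify is
hypothetical.

```python
def classify(fact, possible_worlds):
    """Place a candidate answer tuple in the "true", "false", or "maybe" result.

    Assumes `possible_worlds` is a non-empty collection of sets of tuples.
    """
    holds = [fact in world for world in possible_worlds]
    if all(holds):
        return "true"
    if not any(holds):
        return "false"
    return "maybe"


# Worlds of the unrefined ship relation after the Totor moves to Vancouver.
worlds_after_update = [
    frozenset({("Kranj", "Vancouver"), ("Totor", "Vancouver")}),
    frozenset({("Totor", "Vancouver")}),
]
print(classify(("Totor", "Vancouver"), worlds_after_update))
print(classify(("Kranj", "Vancouver"), worlds_after_update))
print(classify(("Kranj", "Victoria"), worlds_after_update))
```

A query answering strategy that cannot afford full
enumeration may return "maybe" for tuples that this
exhaustive check would place in the "true" or "false"
result.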

We have considered extensions to the relational
database model to support incompleteness. Conditional
relations, which consist of tuples whose existence is
dependent on some condition, are sufficient to express
all incompleteness that can be modeled by a set of
alternative worlds each completely expressible by a
relational database. Unfortunately, generating alternative
worlds or answering queries for conditional relations is
quite complex. On the other hand, set nulls present a
method for handling incomplete information for which
simpler query answering strategies exist. However, set
nulls alone do not have the expressive power of
conditional relations. The expressive power of set nulls can
be enhanced by using predicates which the database
must satisfy. Equality predicates, usually modeled by
marked nulls, are one important form. Another useful
extension is to allow sets of alternative tuples. Exactly
one tuple of each such set must exist in any model of
the database.

In updating databases that model incomplete
worlds, we distinguish between knowledge-adding
updates, which reduce the set of possible worlds, and
change-recording updates, which reflect other changes
in the set of possible worlds. The former may be
described as providing a more complete model of a static
world; they merely narrow down the set of possible
worlds. On the other hand, a change-recording update
marks a transition to a new set of possible worlds, and
is very difficult to perform with an acceptable degree
of precision. This problem is exacerbated by the
interactions between refinement and change-recording
updates.